Speech enhancement in modulation domain using codebook-based speech and noise estimation

Abstract

Conventional single-channel speech enhancement methods implement the analysis-modification-synthesis (AMS) framework in the acoustic frequency domain. Recently, it has been shown that the extension of this framework to the modulation domain may result in better noise suppression. However, this conclusion has been reached by relying on a minimum statistics approach for the required noise power spectral density (PSD) estimation.

Various noise estimation algorithms have been proposed over the years in the speech and audio processing literature. Among these, the widely used minimum statistics approach is known to introduce a time frame lag in the estimated noise spectrum. This can lead to highly inaccurate PSD estimates when the noise behaviour rapidly changes with time, i.e., non-stationary noise. Speech enhancement methods which employ these inaccurate noise PSD estimates tend to perform poorly in the noise suppression task, and in worst cases, may end up deteriorating the noisy speech signal even further. Noise PSD estimation algorithms using a priori information about the noise statistics have been shown to track non-stationary noise better than the conventional algorithms which rely on the minimum statistics approach.

In this thesis, we perform noise suppression in the modulation domain with the noise and speech PSD derived from an estimation scheme which employs the a priori information of various speech and noise types. Specifically, codebooks of gain normalized linear prediction coefficients obtained from training on various speech and noise files are used as the a priori information while performing the estimation of the desired PSD. The PSD estimates derived from this codebook approach are used to obtain a minimum mean square error (MMSE) estimate of the clean speech modulation magnitude spectrum, which is then combined with the phase spectrum of the noisy speech to recover the enhanced speech signal.

The enhanced speech signal is subjected to various objective experiments for evaluation. Results of these evaluations indicate improvement in noise suppression with the proposed codebook-based modulation domain approach over competing approaches, particularly in cases of non-stationary noise.
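
The codebook idea in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's actual estimator, so everything below is an assumption made for illustration: the function names (`ar_envelope`, `fit_gains`, `codebook_search`) are hypothetical, the gains are fitted by plain non-negative least squares on the PSD, and the best speech/noise codebook pair is picked by squared error. Published codebook-based estimators typically use likelihood-based criteria instead, which may also be the case here. Each codebook entry holds gain-normalized linear prediction coefficients a_1..a_p, whose spectral envelope is 1/|A(e^{jw})|^2.

```python
import cmath
import math


def ar_envelope(lpc, n_bins):
    """Gain-normalized AR envelope 1/|A(e^{jw})|^2 on n_bins frequencies in [0, pi).

    lpc holds a_1..a_p for A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    """
    env = []
    for k in range(n_bins):
        w = math.pi * k / n_bins
        a = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lpc))
        env.append(1.0 / abs(a) ** 2)
    return env


def sq_error(noisy_psd, s_env, n_env, gs, gn):
    """Squared error between the observed PSD and the model gs*s_env + gn*n_env."""
    return sum((y - gs * s - gn * n) ** 2 for y, s, n in zip(noisy_psd, s_env, n_env))


def fit_gains(noisy_psd, s_env, n_env):
    """Non-negative least-squares gains (gs, gn) for the two-envelope model.

    Assumption: least squares stands in for whatever criterion the thesis uses.
    """
    ss = sum(x * x for x in s_env)
    nn = sum(x * x for x in n_env)
    sn = sum(x * y for x, y in zip(s_env, n_env))
    sy = sum(x * y for x, y in zip(s_env, noisy_psd))
    ny = sum(x * y for x, y in zip(n_env, noisy_psd))
    det = ss * nn - sn * sn
    if abs(det) > 1e-12 * ss * nn:
        # Solve the 2x2 normal equations directly.
        gs = (sy * nn - ny * sn) / det
        gn = (ny * ss - sy * sn) / det
        if gs >= 0.0 and gn >= 0.0:
            return gs, gn
    # Unconstrained solution is infeasible or ill-conditioned:
    # fall back to the better of the two single-component fits.
    candidates = [(max(sy / ss, 0.0), 0.0), (0.0, max(ny / nn, 0.0))]
    return min(candidates, key=lambda g: sq_error(noisy_psd, s_env, n_env, *g))


def codebook_search(noisy_psd, speech_cb, noise_cb):
    """Try every speech/noise codebook pair; return (i, j, gs, gn) of the best fit."""
    n_bins = len(noisy_psd)
    best = None
    for i, s_lpc in enumerate(speech_cb):
        s_env = ar_envelope(s_lpc, n_bins)
        for j, n_lpc in enumerate(noise_cb):
            n_env = ar_envelope(n_lpc, n_bins)
            gs, gn = fit_gains(noisy_psd, s_env, n_env)
            err = sq_error(noisy_psd, s_env, n_env, gs, gn)
            if best is None or err < best[0]:
                best = (err, i, j, gs, gn)
    return best[1:]
```

Given the selected pair, the speech and noise PSD estimates would be `gs * ar_envelope(speech_cb[i], n_bins)` and `gn * ar_envelope(noise_cb[j], n_bins)`. Because these are refitted for every frame, the noise estimate does not depend on a sliding minimum over past frames, which is the property the abstract credits for better tracking of non-stationary noise.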

Bibliographic details

  • Author

    Mani, Vidhyasagar;

  • Author affiliation
  • Year 2016
  • Total pages
  • Original format PDF
  • Text language en
  • Chinese Library Classification
